Utilizing Microblogs for Automatic News Highlights Extraction

Authors

  • Zhongyu Wei
  • Wei Gao
Abstract

Story highlights form a succinct single-document summary consisting of 3-4 highlight sentences that reflect the gist of a news article. Automatically producing news highlights is very challenging. We propose a novel method to improve news highlights extraction by using microblogs. The hypothesis is that microblog posts, although noisy, are not only indicative of important pieces of information in the news story, but also inherently “short and sweet” as a result of the artificial compression imposed by the length limit. Given a news article, we formulate the problem as two rank-then-extract tasks: (1) we find a set of indicative tweets and use them to assist the ranking of news sentences for extraction; (2) we extract top-ranked tweets as a substitute for sentence extraction. Results based on our news-tweets pairing corpus indicate that the method significantly outperforms some strong baselines for single-document summarization.
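
The two rank-then-extract tasks can be illustrated with a minimal Python sketch. The abstract does not specify the ranking model or its features, so the scoring here (bag-of-words cosine similarity against the best-matching reference text) is only an assumed stand-in, and the names `tokenize`, `cosine`, and `rank_then_extract` are hypothetical. For task (1) the candidates are news sentences and the references are tweets; for task (2) the roles are swapped.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_then_extract(candidates, references, k=3):
    """Score each candidate by its best match among the references,
    keep the k highest-scoring candidates, and return them in their
    original order.

    ASSUMPTION: max cosine similarity is an illustrative score only;
    it is not the ranking method used in the paper.
    """
    ref_vecs = [Counter(tokenize(r)) for r in references]
    scored = []
    for idx, cand in enumerate(candidates):
        vec = Counter(tokenize(cand))
        score = max((cosine(vec, rv) for rv in ref_vecs), default=0.0)
        scored.append((score, idx))
    # Sort by descending score; ties are broken by original position.
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    return [candidates[i] for _, i in sorted(top, key=lambda p: p[1])]
```

Calling `rank_then_extract(sentences, tweets, k=3)` corresponds to task (1), and `rank_then_extract(tweets, sentences, k=3)` to task (2). A real system would likely need to filter noisy tweets and use richer features than word overlap.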

Similar resources

News Feature Extraction for Events on Social Network Platforms

Microblog-based social network platforms like Twitter and Sina Weibo have been important sources for news event extraction. However, existing works on microblog event extraction, which usually use keywords, entities, or selected microblogs to represent events, are not able to extract details of an event. Based on the view of news report, an event should present detailed news features, i.e., whe...

Discovering and Tracking Events From News, Blogs and Microblogs on the Web

Using three data sources, news, blogs, and microblogs, this study proposes a framework for discovering and tracking events embedded in free-form online text. Existing methods for text mining are discussed for the three sources. Because the three sources have different perspectives, event analysis, a region-topic model, and rare keywords are proposed for them, respectively. In order to integrate three data source...

Microblogs Data Management Systems: Querying, Analysis, and Visualization (Tutorial)

Microblogs data, e.g., tweets, reviews, news comments, and social media comments, has gained considerable attention in recent years due to its popularity and rich content. Nowadays, microblog applications span a wide spectrum of interests, including analyzing events and user activities, as well as critical applications like discovering health issues and rescue services. Consequently, major research ...

Semi-automatic keyword based approach for FIRE 2016 Microblog Track

This paper describes our semi-automatic keyword-based approach for the four topics of the Information Extraction from Microblogs Posted during Disasters task at the Forum for Information Retrieval Evaluation (FIRE) 2016. The approach consists of three phases;

Using Signals of Human Interest to Enhance Single-document Summarization

As the amount of information on the Web grows, the ability to retrieve relevant information quickly and easily is necessary. The combination of ample news sources on the Web, little time to browse news, and smaller mobile devices motivates the development of automatic highlight extraction from single news articles. Our system, NetSum, is the first system to produce highlights of an article and ...

Publication date: 2014